HTTP/2 all the things!

challenges, opportunities, and the exciting world ahead of us...
@igrigorik




Who's this guy? :-)



• Performance Engineer @ Google 

o Anything web perf related... 

• Wrote HPBN (read @ hpbn.co) 

o Radio -> TCP -> TLS -> HTTP 
o Browser APIs: XHR, WS, WebRTC

o ... 

• Blog: igvita.com 

• Twitter: @igrigorik
$> telnet igvita.com 80
Connected to 173.230.151.99 

GET /archive 

Hypertext delivery with HTTP 0.9! - eom. 
(connection closed)



HTTP 0.9 is the ultimate MVP - one line, plain-text
"protocol" to test drive the "www idea".



$> telnet ietf.org 80 
Connected to 74.125.xxx.xxx 

GET /rfc/rfc1945.txt HTTP/1.0
User-Agent: CERN-LineMode/2.15 libwww/2.17b3
Accept: */* 

HTTP/1.0 200 OK 

Content-Type: text/plain 
Content-Length: 137582 

Last-Modified: Wed, 1 May 1996 12:45:26 GMT 
Server: Apache 0.84 

4 years of rapid iteration later... eom. 
(connection closed)



HTTP 1.0 is an informational RFC - documents
"common usage" of HTTP found in the wild.



$> telnet google.com 80 
Connected to 74.125.xxx.xxx 

GET /index.html HTTP/1.1
Host: website.org
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_4) ... (snip)
Accept: text/html, application/xhtml+xml, application/xml;q=0.9, */*;q=0.8
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US, en;q=0.8
Cookie: qca=P0-800083390... (snip)

HTTP/1.1 200 OK 
Connection: keep-alive 
Transfer-Encoding: chunked 

Server: nginx/1.0.11 
Content-Type: text/html; charset=utf-8
Date: Wed, 25 Jul 2012 20:23:35 GMT
Expires: Wed, 25 Jul 2012 20:23:35 GMT
Cache-Control: max-age=0, no-cache

100 

<!doctype html> 
(snip) 



HTTP 1.1 ships as RFC standard in 1999 -
hyper{text,media} all the things!



In the meantime.. 




Geocities ftw! 
(circa HTTP/1.1) 



Web applications, not (just) pages.
Rich media and multi-device layouts. 



State of the HTTP nation... 



• 12 distinct hosts per page 

• 78 distinct requests per page 

• 1,232 KB transferred per page



Resulting in typical render times of 2.6-5.6 seconds.



50th and 90th percentiles 



* All numbers are medians, based on latest HTTP Archive crawl data.



Yahoo.com waterfall... BW is not the issue? 
52 requests
4+ seconds
Primer on Web Performance (Chapter 10)



"Connection view" tells the story... 



http://www.yahoo.com

Legend: DNS Lookup | Initial Connection | Time to First Byte | Content Download | Start Render | Document Complete | 3xx result



30 connections 

• DNS lookups 

• TCP handshakes 



Transfer time (in blue) 



We're not BW limited, we're
literally idling, waiting on the 
network to deliver resources. 



Total CPU time: 654,697s 
Total page load time: 2,149,369s 
Average CPU time: 735ms 
Average page load time: 2,413ms 



Chart legend: Network, EvaluateScript, Layout, Paint, Program, FunctionCall, ParseHTML, ResourceReceivedData, RecalculateStyles, DecodeImage, GCEvent, CompositeLayers, ResizeImage, TimerFire, Other

Network:         1,494,671s   69.5%
EvaluateScript:    141,658s    6.6%
Layout:            109,802s    5.1%
Paint:              96,955s    4.5%



blink-dev thread



Top 1M Alexa sites...

• Cable profile (5Mbps / 28 ms RTT) 

• Main thread attribution in Blink 

o Measured via Telemetry 

• 69.5% of time blocked on network 

• 6.6% of time blocked on JavaScript

• 5.1% of time blocked on Layout

• 4.5% of time blocked on Paint



No surprises here... First page load is
network (latency) bound! 



HTTP/1.1 performance problems... 



[Diagram: client and server, with a connection opened and closed between them]




Limited parallelism 

o Maximum of 6 requests per origin

o Pipelining does not work in practice 

o Competing TCP flows, spurious retransmissions 

o Extra handshakes, FDs, memory buffers, etc. 

Client-side request queuing 

o Head-of-line blocking 
o Delayed request dispatch 

High protocol overhead 

o ~S00 bytes of header + cookies 
o No compression of HTTP metadata 



Where there's a will, there's a way... 

We're an inventive bunch, so we came up with some "optimizations" (read, "hacks")



Domain shard... all the things! 



[Screenshot: Etsy Kitten Search]



6 connections per origin 
just add more origins, right? 



Duplicate (spurious) data packets 
due to oversharding 



Optimal number of shards? There is no such thing. It depends on the particular
page, device, network, and network weather. Most sites overshard and hurt
themselves, causing congestion, retransmissions, etc.



http://perf.fail/post/96104709544/zealous-sharding-hurts-etsy-performance



Concat... all the things! 







"Reduce number of requests"... 

• Large monolithic code chunks 

o e.g. most pages use <20% of CSS rules

• Expensive cache invalidations 

o e.g. single char update forces full fetch 

• Delayed execution of JS / CSS

o e.g. must wait for entire JS file to arrive 
o e.g. must wait for entire CSS file to arrive 



Inline... all the things! 

"Reduce number of requests"... 

• Duplicated resources 

o every page must embed the same resource 
o can't use the HTTP cache 

• Breaks prioritization 

o inlined asset is "upgraded" to HTML priority 
o inflates the size of HTML document




Let's fix HTTP instead?



"HTTP 2.0 is a protocol designed for low-latency
transport of content over the World Wide Web"

• Improve end-user perceived latency 

• Address the "head of line blocking" 

• Not require multiple connections 

• Retain the semantics of HTTP/1.1



HTTP/2 in one slide 



• One TCP connection 

• Request -> Stream

o Streams are multiplexed
o Streams are prioritized

• Binary framing layer

o Prioritization 
o Flow control 
o Server push 

• Header compression 



Application (HTTP 2.0) 
Binary Framing 



Session (TLS) (optional) 



Transport (TCP) 



Network (IP) 



HTTP 1.1

POST /upload HTTP/1.1
Host: www.example.org
Content-Type: application/json
Content-Length: 15

{"msg":"hello"}



HTTP 2.0

HEADERS frame
DATA frame



"... we're not replacing all of HTTP - the methods,
status codes, and most of the headers you use today
will be the same. Instead, we're redefining how it gets
used "on the wire" so it's more efficient, and so that it
is more gentle to the Internet itself ..."



- Mark Nottingham (HTTPbis chair) 
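
To make "on the wire" concrete, here is a minimal sketch of how the JSON body from the HTTP 1.1 example could be wrapped in a DATA frame. It assumes the 9-byte frame header layout of the final HTTP/2 specification (RFC 7540: 24-bit length, 8-bit type, 8-bit flags, 31-bit stream id); the drafts current when these slides were written may differ in detail, and the helper names are hypothetical.

```python
import struct

# Frame type and flag values as defined in RFC 7540 (assumed here, see lead-in).
FRAME_DATA = 0x0
FRAME_HEADERS = 0x1
FLAG_END_STREAM = 0x1
FLAG_END_HEADERS = 0x4


def encode_frame(frame_type, flags, stream_id, payload):
    """Prefix a payload with a 9-byte frame header (hypothetical helper)."""
    length = len(payload)
    if length >= 2 ** 24:
        raise ValueError("payload too large for a single frame")
    header = struct.pack(
        ">BHBBI",
        length >> 16,             # upper 8 bits of the 24-bit length
        length & 0xFFFF,          # lower 16 bits of the length
        frame_type,
        flags,
        stream_id & 0x7FFFFFFF,   # top bit is reserved
    )
    return header + payload


def decode_frame_header(data):
    """Return (length, type, flags, stream_id) from the first 9 bytes."""
    hi, lo, frame_type, flags, stream_id = struct.unpack(">BHBBI", data[:9])
    return (hi << 16) | lo, frame_type, flags, stream_id & 0x7FFFFFFF


body = b'{"msg":"hello"}'
frame = encode_frame(FRAME_DATA, FLAG_END_STREAM, 1, body)
```

The request headers would travel separately in a HEADERS frame on the same stream; only the framing changes, not the HTTP semantics.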



Basic data flow in HTTP 2.0... 



[Diagram: a single HTTP 2.0 connection between Client and Server carrying interleaved frames: stream 1 DATA, stream 3 HEADERS, stream 3 DATA, stream 1 DATA, and stream 5 DATA]



• Streams are multiplexed by splitting communication into frames
o All frames (e.g. HEADERS, DATA, etc) are sent over a single TCP connection
o Frames are interleaved
o Frames are prioritized
o Frames are flow controlled
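
A toy illustration of the multiplexing idea above: two payloads are cut into frames, interleaved onto one "connection", and reassembled per stream on the other side. This is only a sketch; the function names, the 4-byte frame size, and the simple round-robin interleaving are assumptions for illustration, not how any real HTTP/2 stack schedules frames.

```python
from collections import defaultdict
from itertools import zip_longest


def split_into_frames(stream_id, payload, max_frame_size=4):
    """Cut one stream's payload into (stream_id, chunk) frames."""
    return [(stream_id, payload[i:i + max_frame_size])
            for i in range(0, len(payload), max_frame_size)]


def interleave(*frame_lists):
    """Round-robin frames from several streams onto one connection."""
    wire = []
    for group in zip_longest(*frame_lists):
        wire.extend(frame for frame in group if frame is not None)
    return wire


def reassemble(wire):
    """Receiver side: rebuild each stream from its interleaved frames."""
    streams = defaultdict(bytes)
    for stream_id, chunk in wire:
        streams[stream_id] += chunk
    return dict(streams)


wire = interleave(split_into_frames(1, b"<html>...</html>"),
                  split_into_frames(3, b"body{}"))
```

Because every frame carries its stream id, the order of frames across streams does not matter; only the order within a stream does.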



Server push... is replacing inlining 



[Diagram: HTTP 2.0 connection carrying frames 1..n of stream 1, plus push promises for streams 2 and 4]

stream 1: /page.html   (client request)
stream 2: /script.js   (push promise)
stream 4: /style.css   (push promise)



Inlining is server push. Except, HTTP 2.0 server push is cacheable 



HTTP/2 header compression 



Request #1

:method       GET
:scheme       https
:host         example.com
:path         /resource
accept        image/jpeg
user-agent    Mozilla/5.0 ...

HEADERS frame (Stream 1)

:method: GET
:scheme: https
:host: example.com
:path: /resource
accept: image/jpeg
user-agent: Mozilla/5.0...





* as low as 9 bytes for an identical request



• Both sides maintain "header tables"

• New requests "toggle" or "insert"
new values into the table 
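
The "header table" idea can be sketched as a simple delta encoder: both sides remember the previous request's headers and only the differences go over the wire. This is a deliberately simplified model, not the actual HTTP/2 header compression format (no static table, no Huffman coding, no indexing); the class and field names are hypothetical.

```python
class HeaderTable:
    """Toy shared state: remembers the headers of the previous request."""

    def __init__(self):
        self.table = {}

    def encode(self, headers):
        """Sender: return only what changed since the previous request."""
        changed = {k: v for k, v in headers.items() if self.table.get(k) != v}
        removed = [k for k in self.table if k not in headers]
        self.table = dict(headers)
        return {"set": changed, "remove": removed}

    def decode(self, delta):
        """Receiver: apply the delta and return the full header set."""
        for name in delta["remove"]:
            self.table.pop(name, None)
        self.table.update(delta["set"])
        return dict(self.table)


client, server = HeaderTable(), HeaderTable()
request_1 = {":method": "GET", ":scheme": "https", ":host": "example.com",
             ":path": "/resource", "accept": "image/jpeg",
             "user-agent": "Mozilla/5.0"}
request_2 = {**request_1, ":path": "/other"}

delta_1 = client.encode(request_1)   # first request: everything is new
delta_2 = client.encode(request_2)   # second request: only :path differs
```

An identical follow-up request produces an empty delta, which is why the per-request overhead can shrink to little more than the frame header itself.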




min(request overhead) = 9 bytes 

max(parallelism) = 100~1000+ streams 

max(client queueing latency) = 0 ms 






But you already knew all that! 

The more interesting part is how it changes web development. 




Remove domain sharding for HTTP/2 



Sharding hurts HTTP/2 performance 

• Breaks prioritization, flow control, etc.

$> openssl s_client -connect google.com:443 |
openssl x509 -noout -text |
grep DNS

DNS:*.google.com, DNS:*.android.com, DNS:*.appengine.google.com, ...

Tip: use altName hosts to deploy domain sharding! *

• HTTP/1.1 -> opens new connection to each origin

• HTTP/2 -> reuses the same connection for altName origins



* Origin must be covered by the cert and resolve to same IP 
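
The two conditions in the footnote can be sketched as a small check. This is an illustrative approximation of what a client might do before reusing a connection for another origin, not any browser's actual logic; the function names and the example IP addresses (documentation range) are assumptions.

```python
def cert_covers(hostname, san_names):
    """True if hostname matches a subjectAltName entry.

    A leading '*' is treated as covering exactly one left-most label.
    """
    host_labels = hostname.lower().split(".")
    for san in san_names:
        san_labels = san.lower().split(".")
        if len(san_labels) != len(host_labels):
            continue
        if san_labels[0] == "*" and len(san_labels) > 1:
            if san_labels[1:] == host_labels[1:]:
                return True
        elif san_labels == host_labels:
            return True
    return False


def can_reuse_connection(hostname, san_names, connection_ip, resolved_ips):
    """Reuse only if the cert covers the host AND it resolves to the same IP."""
    return cert_covers(hostname, san_names) and connection_ip in resolved_ips


sans = ["*.google.com", "*.android.com", "*.appengine.google.com"]
```

With this in place, shard hostnames that share a certificate and an IP collapse back onto one HTTP/2 connection, while HTTP/1.1 clients still see separate origins.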



Remove spriting / concatenation logic...



Streams are cheap, and no longer a constraint.

• Deliver modular resources

o aim to minimize resource churn

o define granular caching strategy for each

• Conditional delivery based on protocol?

o Combine for HTTP/1.1 clients *

o Granular resources for HTTP/2 clients



* Need better tools / infrastructure to do conditional delivery 
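
One possible shape for conditional delivery, as a rough sketch: pick the asset list from the protocol negotiated for the connection. The file paths and function name are made up for illustration, and treating any identifier starting with "h2" as HTTP/2 is an assumption (draft deployments used versioned identifiers).

```python
# Hypothetical asset manifests: one concatenated bundle vs. granular modules.
BUNDLED_ASSETS = ["/static/all.css", "/static/all.js"]
GRANULAR_ASSETS = ["/static/base.css", "/static/product.css",
                   "/static/cart.js", "/static/search.js"]


def assets_for(negotiated_protocol):
    """Granular files for HTTP/2 clients, combined bundle for everyone else."""
    if negotiated_protocol and negotiated_protocol.startswith("h2"):
        return GRANULAR_ASSETS
    return BUNDLED_ASSETS
```

A template layer could call `assets_for(...)` when emitting `<link>` and `<script>` tags, so the same application serves both kinds of clients.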




Leverage server push instead of inlining... 

Server can respond with multiple replies! 

• Client -> I want /product/xyz

• Server -> Ok, and you'll also need... style.css

• Pushed resource is cached independently 
o Use "smart push", don't push on every request 

• Can remove RTT+ from critical path 

• Push... cache invalidations! 

o push a "tombstone" record to invalidate 




Jetty's "smart push" is a great strategy... 



1. Server observes incoming traffic

a. Build a dependency model based on Referer

b. e.g. index.html -> {style.css, app.js}

2. Server initiates push for learned dependencies

a. new client -> GET index.html

b. server -> Push style.css, app.js

Lots of room for experimentation + innovation!
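
The two steps above can be sketched as a small learning model. This is an illustration of the general idea only, not Jetty's implementation; the class and method names are hypothetical, and a real server would also need expiry, size limits, and a way to avoid pushing what the client already has cached.

```python
from collections import defaultdict
from urllib.parse import urlparse


class PushModel:
    """Learns page -> subresource dependencies from Referer headers."""

    def __init__(self):
        self.deps = defaultdict(set)

    def observe(self, path, referer=None):
        """Step 1: record that `path` was requested by the referring page."""
        if referer:
            self.deps[urlparse(referer).path].add(path)

    def resources_to_push(self, path):
        """Step 2: what to push when a new client asks for `path`."""
        return sorted(self.deps.get(path, ()))


model = PushModel()
model.observe("/index.html")
model.observe("/style.css", referer="https://example.com/index.html")
model.observe("/app.js", referer="https://example.com/index.html")
```

After observing that traffic, a request for `/index.html` from a new client would trigger pushes for `/app.js` and `/style.css`.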




Servers need to be *much* smarter

Client is relinquishing a lot of control; badly implemented server -> poor performance



Chrome 28+ does not delay stream dispatch (yay) 



[Chart: Visual Progress - Dev Tools (%), series 1: chrome26, 2: chrome29]

"Don't delay low priority requests
when SPDY is available. Check if
the origin server supports SPDY.
If so, start the request immediately."



Eliminates client queuing 
latency. Means the server must 
be smart about respecting 
client priorities! 



SPDY resource scheduling



Prioritization is key to optimized rendering... 



With HTTP/1.1 browsers held back requests... not with HTTP/2.

o GET index.html, style.css, hero.jpg, other.jpg, more.jpg, ...



critical low priority 



Critical resources should pre-empt others 

o Poorly implemented server: saturate the pipe with static image bytes! 
  e.g. SPDY/v2 implementation in nginx did not respect prioritization, and
performance suffered... test your server! 
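
A minimal sketch of what "critical resources should pre-empt others" could look like on the server side: a priority queue that always drains the most important stream first. The strict-priority policy, the chunk size, and the function name are assumptions for illustration; real servers weigh priorities and dependencies in more nuanced ways.

```python
import heapq
from itertools import count


def schedule(streams, chunk_size=4):
    """Return the send order as (name, chunk) pairs.

    `streams` maps name -> (priority, data); lower numbers are more critical.
    """
    order = count()  # tie-breaker: round-robin among equal priorities
    heap = [(prio, next(order), name, data)
            for name, (prio, data) in streams.items()]
    heapq.heapify(heap)
    sent = []
    while heap:
        prio, _, name, data = heapq.heappop(heap)
        sent.append((name, data[:chunk_size]))
        rest = data[chunk_size:]
        if rest:
            heapq.heappush(heap, (prio, next(order), name, rest))
    return sent


send_order = schedule({
    "hero.jpg": (1, b"bbbbbbbb"),    # low priority image bytes
    "style.css": (0, b"aaaaaaaa"),   # critical, render-blocking
})
```

Even though `hero.jpg` was queued first, every `style.css` chunk goes out before any image bytes, which is the opposite of saturating the pipe with static image data.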



Smart++ server can optimize for each content type! 

Don't hold back all image bytes... send the first ~KB

o Allows the browser to decode the image header and get dimensions 
o Allows the browser to minimize reflows during layout

Stream flow-control enables fine-grained resource control between streams. E.g... 

• T(0): I am willing to receive 4KB of kittens.jpg. 

• T(0): I am willing to receive 500KB of critical.js 

• ... 

• T(n): Ok, now send the remainder of kittens.jpg. 

Client controls how and when the stream and connection window is incremented! 
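
The kittens.jpg exchange above can be modelled with a per-stream credit counter. This is a toy model of the window mechanism only (the class and method names are hypothetical, and the connection-level window is left out): the sender may transmit only as many bytes as the receiver has advertised.

```python
class StreamWindow:
    """Receiver-advertised credit for one stream."""

    def __init__(self, initial):
        self.available = initial

    def window_update(self, increment):
        """Receiver grants more credit."""
        self.available += increment

    def send(self, data):
        """Sender transmits at most `available` bytes; returns (sent, rest)."""
        n = min(len(data), self.available)
        self.available -= n
        return data[:n], data[n:]


image = bytes(20 * 1024)                 # stand-in for kittens.jpg
kittens = StreamWindow(4 * 1024)         # T(0): willing to receive 4KB
first, remainder = kittens.send(image)   # only the first 4KB goes out
stalled, remainder = kittens.send(remainder)  # no credit left: nothing sent
kittens.window_update(len(remainder))    # T(n): ok, send the remainder
last, remainder = kittens.send(remainder)
```

The sender is held back purely by credit, so the client can let critical.js run ahead while the image waits.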



Real-world performance.

Your gains will vary based on site architecture, server, clients, ...



SPDY for API traffic @ Twitter
"However, we have measured as much as a 30% decrease in latency in the wild for API requests
carried over SPDY relative to those carried over HTTP. In particular, we've observed SPDY
helping more as a user's network conditions get worse." - Twitter



https://blog.twitter.com/2013/cocoaspdy-spdy-for-ios-os-x



HTTP/2 and SPDY 





Page load time improvement with SPDY enabled...

                   Google News   Google Sites   Google Drive   Google Maps
Median                 43%           27%            23%            24%
95th percentile        44%           33%            36%            28%

Improvement over HTTP/1.1 + TLS



http://blog.chromium.org/2013/11/making-web-faster-with-spdy-and-http2.html



"SPDY also has advantages on the server:

SPDY requests consume less resources on the server
SPDY requests consume less memory but a bit more CPU
SPDY requires fewer Apache worker threads"



Hervé Servy, Neotys. 



>Y/HTTP2/q ... same results. 



Speaking of TLS. 

make sure your TLS stack is optimized! 





Tuning Nginx TLS Time To First Byte (TTTFB) 
nginx 1.4.4

1873 ms

Pre 1.5.7: bug for 4KB+ certs, resulting in 3RTT+ handshakes
1.7.1 added ssl_buffer_size: 4KB record size removes an RTT
1.7.1 with NPN and forward secrecy -> 1RTT handshake



https://www.igvita.com/2013/12/16/optimizing-nginx-tls-time-to-first-byte/

• "Out of the box" TLS performance is poor... we need to fix this.

• No server is perfect, plenty of work to be done to improve perf.

There is way too much red here... Bug your CDN about fixing this!



An optimized TLS deployment should...



Deliver 1-RTT handshake 100% of the time

1. TLS False Start for new visitors

2. TLS resumption for returning visitors 

3. Ensure that server is able to send full cert chain without blocking 

4. OCSP stapling to avoid blocking 




Optimize data delivery 

1. Optimize record size to avoid unnecessary buffering delays

2. Leverage SPDY / HTTP/2 to further reduce latency and ops costs 
a. Leverage HTTP/2 optimizations: unshard, un-concat, etc 



isTLSfastyet.com 




https://istlsfastyet.com



TLS has exactly one performance problem: 
it is not used widely enough. 

Everything else can be optimized. 



Data delivered over an unencrypted channel is insecure, untrustworthy, and trivially intercepted. We 
owe it to our users to protect the security, privacy, and integrity of their data — all data must be 
encrypted while in flight and at rest. Historically, concerns over performance have been the common 
excuse to avoid these obligations, but today that is a false dichotomy. Let's dispel some myths. 



necessary steps to make HTTP/2 ubiquitous 




Browser support is there, or coming soon... 

• Chrome M39 is shipping HTTP/2 (draft 14)

o Coming in next stable release! Available in Canary today.
o Google servers are also speaking HTTP/2

• Firefox 34 is shipping HTTP/2 (draft 14)

o Coming in next stable release!

• IE supports HTTP/2 on Windows 10 Technical Preview

o In the meantime, IE also supports SPDY v3

• Latest Safari supports SPDY v3

o No official HTTP/2 announcements, but... I'm sure it's coming.



Wait... what about SPDY?



SPDY was an "experimental branch" of HTTP/2
SPDY will be phased out now that we have HTTP/2

o All future and further work will be done within HTTP/2



Server support is coming along as well 



• nghttp2 is awesome! 

o Lots of projects built on top of nghttp2
o Need to test TLS performance though.. :)

• Native Java, C#, Objective-C, Go, Python, Ruby, Erlang libraries

o https://github.com/http2/http2-spec/wiki/Implementations

• Apache and Nginx are both WIP

o No ETA for either project as of today

o Looking for a good project to contribute to? 



tl;dr. 





Site owners & developers 

1. Remove sharding, concatenation, spriting

2. Test your HTTP/2 server: prioritization, server push, etc.

3. Optimize your TLS deployment 



Server & library developers 

1. Respect prioritization and dependency hints
a. Aside: we need better server tests - QPS is not a good metric! 

2. Build smarter models: server push, content-type optimizations, etc 

3. Nudge / contribute HTTP/2 support in your favorite project / language 



Slides 

bit.ly/l rOWzXj 



Thanks! 





+Ilya Grigorik

@igrigorik